Overview

Dataset statistics

Number of variables: 12
Number of observations: 2257
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 755
Duplicate rows (%): 33.5%
Total size in memory: 211.7 KiB
Average record size in memory: 96.1 B
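
The duplicate percentage follows from the raw counts. A minimal sketch, assuming the percentage is the duplicate-row count divided by the number of observations (the variable names are illustrative and not taken from the profiling library):

```python
# Raw figures copied from the dataset statistics.
n_observations = 2257
n_duplicate_rows = 755

# Share of duplicate rows as a percentage of all observations (assumed definition).
duplicate_pct = 100 * n_duplicate_rows / n_observations

print(f"Duplicate rows: {duplicate_pct:.1f}%")
```

How the tool decides which rows count as duplicates is not stated in the report, so this only checks the arithmetic.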

Variable types

Categorical: 3
Numeric: 9

Alerts

Dataset has 755 (33.5%) duplicate rows [Duplicates]
Cement_O.P.C_(Kgperm3) is highly correlated with Type_of_course_Aggregate and 4 other fields [High correlation]
WaterCement_Ratio is highly correlated with Type_of_course_Aggregate and 2 other fields [High correlation]
Water_Content_(Kgperm3) is highly correlated with WaterCement_Ratio [High correlation]
Total_Aggregate_(Kgperm3) is highly correlated with Type_of_Fine_Aggregate_ and 1 other field [High correlation]
Fine_Aggregate_(Kgperm3) is highly correlated with Max._Size_of_Coarse_Aggregate_(mm) and 2 other fields [High correlation]
Coarse_Aggregate_(Kgperm3) is highly correlated with Fine_Aggregate_(Kgperm3) [High correlation]
Hardened_Concrete_Desnity_(avg.) is highly correlated with Total_Aggregate_(Kgperm3) [High correlation]
Type_of_course_Aggregate is highly correlated with Type_of_Fine_Aggregate_ and 2 other fields [High correlation]
Type_of_Fine_Aggregate_ is highly correlated with Type_of_course_Aggregate and 2 other fields [High correlation]
Max._Size_of_Coarse_Aggregate_(mm) is highly correlated with Fine_Aggregate_(Kgperm3) [High correlation]
Hardened_Concrete_Desnity_(avg.) is highly skewed (γ1 = 45.50589334) [Skewed]
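
The skewness alert can be illustrated with a small sketch. It assumes the plain moment-based estimator g1 = m3 / m2^1.5 with no bias correction; the report does not say which estimator it uses, and the sample values below are hypothetical apart from the reported maximum of 24003.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical densities clustered symmetrically around 2405.
typical = [2403, 2404, 2405, 2406, 2407] * 20

# One extreme value (24003 is the reported maximum) dominates the third moment.
with_outlier = typical + [24003]

print(skewness(typical))
print(skewness(with_outlier))
```

The symmetric sample has zero skewness, while a single extreme value pushes it strongly positive, which is the pattern the alert flags.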

Reproduction

Analysis started: 2022-11-08 17:58:31.476845
Analysis finished: 2022-11-08 17:59:19.441627
Duration: 47.96 seconds
Software version: pandas-profiling v3.4.0
Download configuration: config.json

Variables

Type_of_course_Aggregate
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.8 KiB

Value | Count
0 | 1289
Natural | 598
Crushed | 370

Length

Max length: 7
Median length: 1
Mean length: 3.573327426
Min length: 1
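
The length statistics can be re-derived from the category counts. A minimal sketch, assuming each value is measured by its string length (the dictionary is just the counts listed for this variable):

```python
# Category counts for Type_of_course_Aggregate: value -> number of rows.
counts = {"0": 1289, "Natural": 598, "Crushed": 370}

total_rows = sum(counts.values())
total_chars = sum(len(value) * n for value, n in counts.items())
mean_length = total_chars / total_rows

print(total_rows, total_chars, mean_length)
```

Because the one-character value "0" accounts for more than half of the rows, the median length is 1 even though the mean is above 3.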

Characters and Unicode

Total characters: 8065
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Crushed
2nd row: Crushed
3rd row: Crushed
4th row: Natural
5th row: Natural

Common Values

Value | Count | Frequency (%)
0 | 1289 | 57.1%
Natural | 598 | 26.5%
Crushed | 370 | 16.4%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
0 | 1289 | 57.1%
natural | 598 | 26.5%
crushed | 370 | 16.4%

Most occurring characters

Value | Count | Frequency (%)
0 | 1289 | 16.0%
a | 1196 | 14.8%
u | 968 | 12.0%
r | 968 | 12.0%
N | 598 | 7.4%
t | 598 | 7.4%
l | 598 | 7.4%
C | 370 | 4.6%
s | 370 | 4.6%
h | 370 | 4.6%
Other values (2) | 740 | 9.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 5808 | 72.0%
Decimal Number | 1289 | 16.0%
Uppercase Letter | 968 | 12.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 1196 | 20.6%
u | 968 | 16.7%
r | 968 | 16.7%
t | 598 | 10.3%
l | 598 | 10.3%
s | 370 | 6.4%
h | 370 | 6.4%
e | 370 | 6.4%
d | 370 | 6.4%

Uppercase Letter
Value | Count | Frequency (%)
N | 598 | 61.8%
C | 370 | 38.2%

Decimal Number
Value | Count | Frequency (%)
0 | 1289 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 6776 | 84.0%
Common | 1289 | 16.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 1196 | 17.7%
u | 968 | 14.3%
r | 968 | 14.3%
N | 598 | 8.8%
t | 598 | 8.8%
l | 598 | 8.8%
C | 370 | 5.5%
s | 370 | 5.5%
h | 370 | 5.5%
e | 370 | 5.5%

Common
Value | Count | Frequency (%)
0 | 1289 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 8065 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1289 | 16.0%
a | 1196 | 14.8%
u | 968 | 12.0%
r | 968 | 12.0%
N | 598 | 7.4%
t | 598 | 7.4%
l | 598 | 7.4%
C | 370 | 4.6%
s | 370 | 4.6%
h | 370 | 4.6%
Other values (2) | 740 | 9.2%

Type_of_Fine_Aggregate_
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 17.8 KiB

Value | Count
0 | 1289
Natural | 955
Crushed | 12
40 | 1

Length

Max length: 7
Median length: 1
Mean length: 3.571112096
Min length: 1

Characters and Unicode

Total characters: 8060
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Natural
2nd row: Natural
3rd row: Natural
4th row: Natural
5th row: Natural

Common Values

Value | Count | Frequency (%)
0 | 1289 | 57.1%
Natural | 955 | 42.3%
Crushed | 12 | 0.5%
40 | 1 | < 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
0 | 1289 | 57.1%
natural | 955 | 42.3%
crushed | 12 | 0.5%
40 | 1 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
a | 1910 | 23.7%
0 | 1290 | 16.0%
u | 967 | 12.0%
r | 967 | 12.0%
N | 955 | 11.8%
t | 955 | 11.8%
l | 955 | 11.8%
C | 12 | 0.1%
s | 12 | 0.1%
h | 12 | 0.1%
Other values (3) | 25 | 0.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 5802 | 72.0%
Decimal Number | 1291 | 16.0%
Uppercase Letter | 967 | 12.0%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 1910 | 32.9%
u | 967 | 16.7%
r | 967 | 16.7%
t | 955 | 16.5%
l | 955 | 16.5%
s | 12 | 0.2%
h | 12 | 0.2%
e | 12 | 0.2%
d | 12 | 0.2%

Decimal Number
Value | Count | Frequency (%)
0 | 1290 | 99.9%
4 | 1 | 0.1%

Uppercase Letter
Value | Count | Frequency (%)
N | 955 | 98.8%
C | 12 | 1.2%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 6769 | 84.0%
Common | 1291 | 16.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 1910 | 28.2%
u | 967 | 14.3%
r | 967 | 14.3%
N | 955 | 14.1%
t | 955 | 14.1%
l | 955 | 14.1%
C | 12 | 0.2%
s | 12 | 0.2%
h | 12 | 0.2%
e | 12 | 0.2%

Common
Value | Count | Frequency (%)
0 | 1290 | 99.9%
4 | 1 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 8060 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 1910 | 23.7%
0 | 1290 | 16.0%
u | 967 | 12.0%
r | 967 | 12.0%
N | 955 | 11.8%
t | 955 | 11.8%
l | 955 | 11.8%
C | 12 | 0.1%
s | 12 | 0.1%
h | 12 | 0.1%
Other values (3) | 25 | 0.3%

Max._Size_of_Coarse_Aggregate_(mm)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.8 KiB

Value | Count
20 | 1547
40 | 710

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 4514
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 40
2nd row: 40
3rd row: 20
4th row: 20
5th row: 20

Common Values

Value | Count | Frequency (%)
20 | 1547 | 68.5%
40 | 710 | 31.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
20 | 1547 | 68.5%
40 | 710 | 31.5%

Most occurring characters

Value | Count | Frequency (%)
0 | 2257 | 50.0%
2 | 1547 | 34.3%
4 | 710 | 15.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 4514 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 2257 | 50.0%
2 | 1547 | 34.3%
4 | 710 | 15.7%

Most occurring scripts

Value | Count | Frequency (%)
Common | 4514 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 2257 | 50.0%
2 | 1547 | 34.3%
4 | 710 | 15.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4514 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 2257 | 50.0%
2 | 1547 | 34.3%
4 | 710 | 15.7%

Cement_O.P.C_(Kgperm3)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 23
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 362.7204253
Minimum: 220
Maximum: 450
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 17.8 KiB

Quantile statistics

Minimum: 220
5-th percentile: 330
Q1: 350
median: 365
Q3: 375
95-th percentile: 400
Maximum: 450
Range: 230
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 20.26800255
Coefficient of variation (CV): 0.0558777536
Kurtosis: 4.293691674
Mean: 362.7204253
Median Absolute Deviation (MAD): 15
Skewness: -0.4009837665
Sum: 818660
Variance: 410.7919275
Monotonicity: Not monotonic
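
Several of the figures for this variable are derived from others in the same tables. A short sketch of those relationships, assuming the usual definitions (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared); the variable names are illustrative:

```python
# Values copied from the Cement_O.P.C_(Kgperm3) statistics.
minimum, q1, q3, maximum = 220, 350, 375, 450
mean = 362.7204253
std = 20.26800255

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle 50% of values
cv = std / mean                   # relative dispersion (unitless)
variance = std ** 2               # squared standard deviation

print(value_range, iqr, cv, variance)
```

The results agree with the reported Range, IQR, CV and Variance up to rounding of the displayed digits.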
Histogram with fixed size bins (bins=23)
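Common values

Value | Count | Frequency (%)
350 | 727 | 32.2%
375 | 621 | 27.5%
365 | 213 | 9.4%
400 | 173 | 7.7%
360 | 126 | 5.6%
340 | 87 | 3.9%
370 | 59 | 2.6%
325 | 51 | 2.3%
380 | 39 | 1.7%
385 | 34 | 1.5%
Other values (13) | 127 | 5.6%

Minimum 10 values

Value | Count | Frequency (%)
220 | 2 | 0.1%
250 | 4 | 0.2%
300 | 13 | 0.6%
310 | 9 | 0.4%
315 | 3 | 0.1%
320 | 10 | 0.4%
325 | 51 | 2.3%
330 | 25 | 1.1%
335 | 26 | 1.2%
340 | 87 | 3.9%

Maximum 10 values

Value | Count | Frequency (%)
450 | 6 | 0.3%
410 | 15 | 0.7%
400 | 173 | 7.7%
390 | 4 | 0.2%
385 | 34 | 1.5%
380 | 39 | 1.7%
375 | 621 | 27.5%
370 | 59 | 2.6%
365 | 213 | 9.4%
360 | 126 | 5.6%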

WaterCement_Ratio
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 21
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4893176783
Minimum: 0.33
Maximum: 0.62
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 17.8 KiB

Quantile statistics

Minimum: 0.33
5-th percentile: 0.44
Q1: 0.47
median: 0.49
Q3: 0.51
95-th percentile: 0.54
Maximum: 0.62
Range: 0.29
Interquartile range (IQR): 0.04

Descriptive statistics

Standard deviation: 0.03223190861
Coefficient of variation (CV): 0.06587113044
Kurtosis: 1.742037642
Mean: 0.4893176783
Median Absolute Deviation (MAD): 0.02
Skewness: -0.1263135102
Sum: 1104.39
Variance: 0.001038895933
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=21)
Common values

Value | Count | Frequency (%)
0.51 | 358 | 15.9%
0.5 | 307 | 13.6%
0.48 | 290 | 12.8%
0.49 | 240 | 10.6%
0.47 | 195 | 8.6%
0.45 | 189 | 8.4%
0.46 | 151 | 6.7%
0.54 | 110 | 4.9%
0.44 | 108 | 4.8%
0.53 | 101 | 4.5%
Other values (11) | 208 | 9.2%

Minimum 10 values

Value | Count | Frequency (%)
0.33 | 6 | 0.3%
0.4 | 11 | 0.5%
0.41 | 9 | 0.4%
0.42 | 15 | 0.7%
0.43 | 15 | 0.7%
0.44 | 108 | 4.8%
0.45 | 189 | 8.4%
0.46 | 151 | 6.7%
0.47 | 195 | 8.6%
0.48 | 290 | 12.8%

Maximum 10 values

Value | Count | Frequency (%)
0.62 | 3 | 0.1%
0.6 | 12 | 0.5%
0.57 | 3 | 0.1%
0.56 | 22 | 1.0%
0.55 | 25 | 1.1%
0.54 | 110 | 4.9%
0.53 | 101 | 4.5%
0.52 | 87 | 3.9%
0.51 | 358 | 15.9%
0.5 | 307 | 13.6%

Water_Content_(Kgperm3)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 27
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 177.8529021
Minimum: 100
Maximum: 290
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 17.8 KiB

Quantile statistics

Minimum: 100
5-th percentile: 160
Q1: 170
median: 180
Q3: 185
95-th percentile: 195
Maximum: 290
Range: 190
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 13.36793764
Coefficient of variation (CV): 0.07516288735
Kurtosis: 12.46891977
Mean: 177.8529021
Median Absolute Deviation (MAD): 10
Skewness: 1.184488228
Sum: 401414
Variance: 178.7017569
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=27)
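Common values

Value | Count | Frequency (%)
180 | 431 | 19.1%
190 | 367 | 16.3%
185 | 296 | 13.1%
165 | 281 | 12.5%
175 | 256 | 11.3%
170 | 236 | 10.5%
160 | 184 | 8.2%
168 | 36 | 1.6%
195 | 28 | 1.2%
210 | 22 | 1.0%
Other values (17) | 120 | 5.3%

Minimum 10 values

Value | Count | Frequency (%)
100 | 3 | 0.1%
107 | 2 | 0.1%
145 | 6 | 0.3%
150 | 6 | 0.3%
153 | 8 | 0.4%
155 | 7 | 0.3%
157 | 3 | 0.1%
160 | 184 | 8.2%
162 | 2 | 0.1%
165 | 281 | 12.5%

Maximum 10 values

Value | Count | Frequency (%)
290 | 5 | 0.2%
225 | 6 | 0.3%
215 | 20 | 0.9%
210 | 22 | 1.0%
205 | 19 | 0.8%
200 | 15 | 0.7%
195 | 28 | 1.2%
190 | 367 | 16.3%
185 | 296 | 13.1%
184 | 2 | 0.1%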

Total_Aggregate_(Kgperm3)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 46
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1875.70226
Minimum: 1185
Maximum: 1980
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 17.8 KiB

Quantile statistics

Minimum: 1185
5-th percentile: 1825
Q1: 1860
median: 1880
Q3: 1895
95-th percentile: 1916
Maximum: 1980
Range: 795
Interquartile range (IQR): 35

Descriptive statistics

Standard deviation: 42.580066
Coefficient of variation (CV): 0.02270086619
Kurtosis: 113.5267728
Mean: 1875.70226
Median Absolute Deviation (MAD): 20
Skewness: -7.867644278
Sum: 4233460
Variance: 1813.062021
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=46)
Common values

Value | Count | Frequency (%)
1890 | 202 | 8.9%
1875 | 171 | 7.6%
1870 | 165 | 7.3%
1885 | 155 | 6.9%
1855 | 143 | 6.3%
1880 | 132 | 5.8%
1910 | 122 | 5.4%
1860 | 121 | 5.4%
1900 | 112 | 5.0%
1915 | 112 | 5.0%
Other values (36) | 822 | 36.4%

Minimum 10 values

Value | Count | Frequency (%)
1185 | 2 | 0.1%
1305 | 3 | 0.1%
1340 | 1 | < 0.1%
1775 | 4 | 0.2%
1785 | 8 | 0.4%
1795 | 2 | 0.1%
1800 | 7 | 0.3%
1805 | 16 | 0.7%
1810 | 20 | 0.9%
1815 | 13 | 0.6%

Maximum 10 values

Value | Count | Frequency (%)
1980 | 12 | 0.5%
1970 | 4 | 0.2%
1965 | 3 | 0.1%
1940 | 13 | 0.6%
1935 | 3 | 0.1%
1930 | 14 | 0.6%
1925 | 15 | 0.7%
1923 | 3 | 0.1%
1920 | 46 | 2.0%
1915 | 112 | 5.0%

Fine_Aggregate_(Kgperm3)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 79
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 715.7727071
Minimum: 275
Maximum: 865
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 17.8 KiB

Quantile statistics

Minimum: 275
5-th percentile: 610
Q1: 680
median: 715
Q3: 760
95-th percentile: 815
Maximum: 865
Range: 590
Interquartile range (IQR): 80

Descriptive statistics

Standard deviation: 63.79362782
Coefficient of variation (CV): 0.08912553829
Kurtosis: 2.80833666
Mean: 715.7727071
Median Absolute Deviation (MAD): 40
Skewness: -0.6932806605
Sum: 1615499
Variance: 4069.62695
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
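Common values

Value | Count | Frequency (%)
760 | 130 | 5.8%
715 | 123 | 5.4%
700 | 122 | 5.4%
675 | 114 | 5.1%
710 | 109 | 4.8%
725 | 96 | 4.3%
695 | 94 | 4.2%
750 | 94 | 4.2%
690 | 71 | 3.1%
770 | 67 | 3.0%
Other values (69) | 1237 | 54.8%

Minimum 10 values

Value | Count | Frequency (%)
275 | 1 | < 0.1%
360 | 4 | 0.2%
477 | 2 | 0.1%
490 | 2 | 0.1%
505 | 2 | 0.1%
510 | 2 | 0.1%
525 | 3 | 0.1%
530 | 8 | 0.4%
540 | 6 | 0.3%
550 | 3 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
865 | 14 | 0.6%
860 | 8 | 0.4%
855 | 3 | 0.1%
850 | 6 | 0.3%
842 | 3 | 0.1%
840 | 15 | 0.7%
835 | 12 | 0.5%
830 | 21 | 0.9%
825 | 6 | 0.3%
820 | 18 | 0.8%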

Coarse_Aggregate_(Kgperm3)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 101
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1154.162605
Minimum: 475
Maximum: 1510
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 17.8 KiB

Quantile statistics

Minimum: 475
5-th percentile: 1055
Q1: 1128
median: 1160
Q3: 1195
95-th percentile: 1250
Maximum: 1510
Range: 1035
Interquartile range (IQR): 67

Descriptive statistics

Standard deviation: 77.72022965
Coefficient of variation (CV): 0.0673390641
Kurtosis: 16.42015097
Mean: 1154.162605
Median Absolute Deviation (MAD): 35
Skewness: -2.807374046
Sum: 2604945
Variance: 6040.434097
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
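Common values

Value | Count | Frequency (%)
1200 | 140 | 6.2%
1160 | 120 | 5.3%
1170 | 116 | 5.1%
1150 | 111 | 4.9%
1180 | 98 | 4.3%
1190 | 80 | 3.5%
1185 | 79 | 3.5%
1145 | 79 | 3.5%
1205 | 75 | 3.3%
1140 | 74 | 3.3%
Other values (91) | 1285 | 56.9%

Minimum 10 values

Value | Count | Frequency (%)
475 | 1 | < 0.1%
635 | 1 | < 0.1%
642 | 2 | 0.1%
660 | 3 | 0.1%
670 | 1 | < 0.1%
680 | 2 | 0.1%
685 | 2 | 0.1%
690 | 1 | < 0.1%
696 | 2 | 0.1%
705 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
1510 | 1 | < 0.1%
1365 | 4 | 0.2%
1345 | 3 | 0.1%
1335 | 4 | 0.2%
1330 | 2 | 0.1%
1320 | 5 | 0.2%
1315 | 6 | 0.3%
1305 | 8 | 0.4%
1295 | 2 | 0.1%
1290 | 3 | 0.1%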

Workability_Slump_(mm)
Real number (ℝ≥0)

Distinct: 38
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 150.1626052
Minimum: 0
Maximum: 230
Zeros: 5
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 17.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 90
Q1: 120
median: 150
Q3: 180
95-th percentile: 205
Maximum: 230
Range: 230
Interquartile range (IQR): 60

Descriptive statistics

Standard deviation: 38.62195915
Coefficient of variation (CV): 0.2572009129
Kurtosis: -0.5094335418
Mean: 150.1626052
Median Absolute Deviation (MAD): 30
Skewness: -0.1613442356
Sum: 338917
Variance: 1491.655729
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=38)
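Common values

Value | Count | Frequency (%)
150 | 291 | 12.9%
200 | 248 | 11.0%
120 | 195 | 8.6%
100 | 167 | 7.4%
170 | 145 | 6.4%
140 | 124 | 5.5%
130 | 103 | 4.6%
180 | 99 | 4.4%
160 | 92 | 4.1%
190 | 87 | 3.9%
Other values (28) | 706 | 31.3%

Minimum 10 values

Value | Count | Frequency (%)
0 | 5 | 0.2%
60 | 1 | < 0.1%
65 | 4 | 0.2%
70 | 22 | 1.0%
75 | 10 | 0.4%
80 | 49 | 2.2%
85 | 8 | 0.4%
90 | 49 | 2.2%
95 | 21 | 0.9%
100 | 167 | 7.4%

Maximum 10 values

Value | Count | Frequency (%)
230 | 18 | 0.8%
225 | 2 | 0.1%
220 | 49 | 2.2%
215 | 6 | 0.3%
210 | 36 | 1.6%
205 | 15 | 0.7%
200 | 248 | 11.0%
195 | 28 | 1.2%
190 | 87 | 3.9%
185 | 39 | 1.7%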

Hardened_Concrete_Desnity_(avg.)
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 143
Distinct (%): 6.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2429.733846
Minimum: 202
Maximum: 24003
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 17.8 KiB

Quantile statistics

Minimum: 202
5-th percentile: 2386
Q1: 2403
median: 2407
Q3: 2452.67
95-th percentile: 2459
Maximum: 24003
Range: 23801
Interquartile range (IQR): 49.67

Descriptive statistics

Standard deviation: 460.5104397
Coefficient of variation (CV): 0.1895312281
Kurtosis: 2138.025559
Mean: 2429.733846
Median Absolute Deviation (MAD): 18
Skewness: 45.50589334
Sum: 5483909.29
Variance: 212069.8651
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
2405 | 208 | 9.2%
2404 | 165 | 7.3%
2455 | 135 | 6.0%
2403 | 132 | 5.8%
2406 | 102 | 4.5%
2454 | 87 | 3.9%
2453 | 84 | 3.7%
2456 | 66 | 2.9%
2407 | 65 | 2.9%
2402 | 65 | 2.9%
Other values (133) | 1148 | 50.9%

Minimum 10 values

Value | Count | Frequency (%)
202 | 2 | 0.1%
1404 | 1 | < 0.1%
2133 | 3 | 0.1%
2369 | 2 | 0.1%
2382 | 11 | 0.5%
2383 | 9 | 0.4%
2384 | 24 | 1.1%
2385 | 36 | 1.6%
2385.5 | 1 | < 0.1%
2386 | 29 | 1.3%

Maximum 10 values

Value | Count | Frequency (%)
24003 | 1 | < 0.1%
2627 | 3 | 0.1%
2508 | 3 | 0.1%
2496 | 2 | 0.1%
2493 | 1 | < 0.1%
2486 | 1 | < 0.1%
2484 | 1 | < 0.1%
2483 | 3 | 0.1%
2481 | 6 | 0.3%
2480 | 3 | 0.1%

7_day_str
Real number (ℝ≥0)

Distinct: 209
Distinct (%): 9.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.98697386
Minimum: 12
Maximum: 45.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 17.8 KiB

Quantile statistics

Minimum: 12
5-th percentile: 21.9
Q1: 27.4
median: 31.2
Q3: 34.7
95-th percentile: 39.2
Maximum: 45.9
Range: 33.9
Interquartile range (IQR): 7.3

Descriptive statistics

Standard deviation: 5.298545274
Coefficient of variation (CV): 0.1709926661
Kurtosis: -0.1648309964
Mean: 30.98697386
Median Absolute Deviation (MAD): 3.7
Skewness: -0.08321793553
Sum: 69937.6
Variance: 28.07458202
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
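Common values

Value | Count | Frequency (%)
28.3 | 35 | 1.6%
27.7 | 32 | 1.4%
25.9 | 32 | 1.4%
34.6 | 31 | 1.4%
31.1 | 31 | 1.4%
35.9 | 29 | 1.3%
32.4 | 29 | 1.3%
31.4 | 29 | 1.3%
33.1 | 28 | 1.2%
26.8 | 28 | 1.2%
Other values (199) | 1953 | 86.5%

Minimum 10 values

Value | Count | Frequency (%)
12 | 2 | 0.1%
15.6 | 3 | 0.1%
15.7 | 1 | < 0.1%
15.9 | 2 | 0.1%
17.9 | 4 | 0.2%
18.2 | 4 | 0.2%
18.3 | 4 | 0.2%
18.7 | 2 | 0.1%
19 | 7 | 0.3%
19.2 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
45.9 | 9 | 0.4%
45.8 | 2 | 0.1%
44.2 | 6 | 0.3%
42.6 | 6 | 0.3%
42.3 | 6 | 0.3%
42.1 | 6 | 0.3%
41.9 | 6 | 0.3%
41.8 | 4 | 0.2%
41.4 | 3 | 0.1%
41.2 | 6 | 0.3%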

Interactions

[Figure: pairwise interaction plots (81 images, not preserved)]

Correlations


Auto

The auto setting picks an easily interpretable association measure for each pair of columns according to their types: categorical-categorical pairs use Cramér's V, numerical-categorical pairs use Cramér's V (computed on a discretized numerical column), and numerical-numerical pairs use Spearman's ρ. In other words, the most suitable measure is chosen for each pair of columns.
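
The mapping can be written down as a small lookup. This is only a sketch of the rule as stated in the text; the function name and the type labels "numerical" and "categorical" are illustrative and are not identifiers from the profiling library.

```python
def auto_method(type_a, type_b):
    """Pick an association measure for a pair of column types (illustrative)."""
    for t in (type_a, type_b):
        if t not in ("numerical", "categorical"):
            raise ValueError(f"unsupported column type: {t}")
    if type_a == type_b == "numerical":
        return "Spearman's rho"
    # Any pair involving a categorical column uses Cramér's V; a numerical
    # partner is assumed to be discretized into bins beforehand.
    return "Cramér's V"

print(auto_method("numerical", "numerical"))
print(auto_method("numerical", "categorical"))
print(auto_method("categorical", "categorical"))
```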

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
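
The calculation described above can be sketched in plain Python. This assumes tied values receive the average of their rank positions, which is a common convention but is not stated in the report; the helper names are illustrative.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def covariance(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    return covariance(rx, ry) / (covariance(rx, rx) ** 0.5 * covariance(ry, ry) ** 0.5)

x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # y = x**3: nonlinear but strictly increasing
print(spearman_rho(x, y))
```

Because only the ranks matter, the cubic relationship gives a coefficient of 1 up to floating-point error.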

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
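
A minimal sketch of this calculation, with a check of the invariance property mentioned above. The data are made up for illustration, and the rescaling uses positive scale factors.

```python
def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = (sum((a - mx) ** 2 for a in x) / n) ** 0.5
    sy = (sum((b - my) ** 2 for b in y) / n) ** 0.5
    return cov / (sx * sy)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]  # roughly linear in x

r = pearson_r(x, y)

# Shift and rescale each variable separately; r should be unchanged.
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

print(r, r_rescaled)
```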

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
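
The pair-counting definition can be sketched directly. Note the assumption: this is the simple form described in the text (often called tau-a), in which tied pairs count as neither concordant nor discordant; statistical libraries frequently apply a tie correction, so their results can differ on data with ties.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair in opposite ways
    return (concordant - discordant) / len(pairs)

x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # one swapped pair out of six
tau = kendall_tau_a(x, y)
print(tau)
```

Here five pairs are concordant and one is discordant, so τ = (5 - 1) / 6.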

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
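
A sketch of a bias-corrected Cramér's V in plain Python, following the commonly cited form of Bergsma's correction. It is an illustration under that assumption, not a copy of the report's implementation, and the function name is hypothetical.

```python
from collections import Counter

def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(x)
    joint = Counter(zip(x, y))
    row_totals = Counter(x)
    col_totals = Counter(y)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (joint.get((a, b), 0) - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0
    return (phi2_corr / denom) ** 0.5

perfect = cramers_v_corrected(["a", "b"] * 50, ["u", "v"] * 50)
independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["u", "v", "u", "v"] * 25)
print(perfect, independent)
```

The first pair of columns is perfectly associated and the second is exactly independent, which gives values at the two ends of the 0 to 1 range.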

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

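Index | Type_of_course_Aggregate | Type_of_Fine_Aggregate_ | Max._Size_of_Coarse_Aggregate_(mm) | Cement_O.P.C_(Kgperm3) | WaterCement_Ratio | Water_Content_(Kgperm3) | Total_Aggregate_(Kgperm3) | Fine_Aggregate_(Kgperm3) | Coarse_Aggregate_(Kgperm3) | Workability_Slump_(mm) | Hardened_Concrete_Desnity_(avg.) | 7_day_str
0 | Crushed | Natural | 40 | 365 | 0.52 | 225 | 1870 | 710 | 1160 | 160 | 2407.0 | 21.8
1 | Crushed | Natural | 40 | 365 | 0.52 | 225 | 1870 | 710 | 1160 | 160 | 2403.0 | 26.3
2 | Crushed | Natural | 20 | 350 | 0.53 | 185 | 1915 | 725 | 1190 | 120 | 2475.0 | 32.7
3 | Natural | Natural | 20 | 340 | 0.49 | 165 | 1895 | 835 | 1060 | 190 | 2412.0 | 30.3
4 | Natural | Natural | 20 | 325 | 0.51 | 165 | 1910 | 840 | 1070 | 170 | 2404.0 | 27.0
5 | Natural | Natural | 20 | 325 | 0.51 | 165 | 1910 | 840 | 1070 | 170 | 2405.0 | 27.0
6 | Crushed | Natural | 20 | 340 | 0.56 | 190 | 1895 | 795 | 1100 | 105 | 2425.0 | 31.8
7 | Crushed | Natural | 20 | 340 | 0.56 | 190 | 1895 | 795 | 1100 | 105 | 2427.0 | 31.8
8 | Crushed | Natural | 20 | 370 | 0.50 | 185 | 1895 | 795 | 1100 | 125 | 2449.0 | 37.8
9 | Crushed | Natural | 20 | 360 | 0.53 | 190 | 1900 | 780 | 1120 | 100 | 2454.0 | 28.2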

Last rows

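Index | Type_of_course_Aggregate | Type_of_Fine_Aggregate_ | Max._Size_of_Coarse_Aggregate_(mm) | Cement_O.P.C_(Kgperm3) | WaterCement_Ratio | Water_Content_(Kgperm3) | Total_Aggregate_(Kgperm3) | Fine_Aggregate_(Kgperm3) | Coarse_Aggregate_(Kgperm3) | Workability_Slump_(mm) | Hardened_Concrete_Desnity_(avg.) | 7_day_str
2247 | 0 | 0 | 20 | 375 | 0.51 | 190 | 1875 | 750 | 1125 | 130 | 2444.0 | 26.9
2248 | 0 | 0 | 20 | 375 | 0.51 | 190 | 1855 | 755 | 1130 | 100 | 2454.0 | 24.5
2249 | 0 | 0 | 20 | 385 | 0.48 | 215 | 1810 | 685 | 1125 | 70 | 2414.0 | 22.3
2250 | 0 | 0 | 20 | 375 | 0.52 | 195 | 1860 | 570 | 1200 | 160 | 2435.0 | 41.0
2251 | 0 | 0 | 20 | 365 | 0.47 | 170 | 1865 | 700 | 1165 | 190 | 2404.0 | 34.8
2252 | 0 | 0 | 20 | 375 | 0.48 | 180 | 1845 | 665 | 1180 | 150 | 2404.0 | 24.8
2253 | 0 | 0 | 20 | 375 | 0.45 | 170 | 1855 | 725 | 1130 | 175 | 2404.0 | 34.7
2254 | 0 | 0 | 20 | 365 | 0.51 | 185 | 1900 | 800 | 1200 | 135 | 2454.0 | 35.4
2255 | 0 | 0 | 20 | 385 | 0.47 | 180 | 1815 | 655 | 1160 | 95 | 2386.0 | 25.9
2256 | 0 | 0 | 20 | 340 | 0.47 | 160 | 1900 | 760 | 1140 | 195 | 2402.0 | 25.9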

Duplicate rows

Most frequently occurring

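Index | Type_of_course_Aggregate | Type_of_Fine_Aggregate_ | Max._Size_of_Coarse_Aggregate_(mm) | Cement_O.P.C_(Kgperm3) | WaterCement_Ratio | Water_Content_(Kgperm3) | Total_Aggregate_(Kgperm3) | Fine_Aggregate_(Kgperm3) | Coarse_Aggregate_(Kgperm3) | Workability_Slump_(mm) | Hardened_Concrete_Desnity_(avg.) | 7_day_str | # duplicates
184 | 0 | 0 | 20 | 375 | 0.49 | 185 | 1900 | 760 | 1140 | 175 | 2465.0 | 28.3 | 9
11 | 0 | 0 | 20 | 350 | 0.46 | 160 | 1890 | 755 | 1135 | 195 | 2403.0 | 34.1 | 6
16 | 0 | 0 | 20 | 350 | 0.47 | 165 | 1885 | 715 | 1170 | 180 | 2404.0 | 34.3 | 6
26 | 0 | 0 | 20 | 360 | 0.46 | 165 | 1865 | 785 | 1080 | 170 | 2394.0 | 40.0 | 6
30 | 0 | 0 | 20 | 360 | 0.46 | 170 | 1885 | 750 | 1135 | 150 | 2414.0 | 41.2 | 6
47 | 0 | 0 | 20 | 365 | 0.45 | 165 | 1870 | 730 | 1140 | 165 | 2405.0 | 35.4 | 6
54 | 0 | 0 | 20 | 365 | 0.47 | 170 | 1865 | 655 | 1210 | 160 | 2403.0 | 33.2 | 6
60 | 0 | 0 | 20 | 365 | 0.49 | 180 | 1895 | 780 | 1115 | 220 | 2442.0 | 35.9 | 6
63 | 0 | 0 | 20 | 365 | 0.51 | 185 | 1900 | 700 | 1200 | 160 | 2455.0 | 36.8 | 6
77 | 0 | 0 | 20 | 365 | 0.52 | 190 | 1885 | 735 | 1150 | 200 | 2445.1 | 37.9 | 6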